
Add Blackwell Cuda GB10 Support #154

Open
snmabaur wants to merge 1 commit into davidamacey:master from swissnationalmuseum:feature/cuda-gb10-blackwell-support

@snmabaur

  • Create Dockerfile.blackwell
  • Pin a fixed version of huggingface_hub
  • Create new docker-compose.blackwell.yml
  • In opentranscribe.sh, check whether the GPU is a Blackwell GPU and, if so, load the new docker-compose.blackwell.yml
  • Update the update_gpu_stats function for Blackwell GPUs

Pull Request

Description

Brief description of what this PR does

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation update
  • Code refactoring
  • Performance improvement

Changes made

  • Create Dockerfile.blackwell
  • Pin a fixed version of huggingface_hub
  • Create new docker-compose.blackwell.yml
  • In opentranscribe.sh, check whether the GPU is a Blackwell GPU and, if so, load the new docker-compose.blackwell.yml
  • Update the update_gpu_stats function for Blackwell GPUs

Testing

  • I have tested these changes locally
  • I have added/updated tests for my changes
  • All existing tests pass
  • I have tested with different audio/video formats (if applicable)

Frontend changes (if applicable)

  • Changes work in light and dark mode
  • Changes are responsive on mobile devices
  • No console errors or warnings

Backend changes (if applicable)

  • API endpoints are properly documented
  • Database migrations are included (if needed)
  • Error handling is implemented
  • Logging is appropriate

Documentation

  • I have updated the README if needed
  • I have updated relevant documentation
  • Code is properly commented

Screenshots (if applicable)

Add screenshots to help explain your changes

Additional notes

Any additional information that reviewers should know

@davidamacey
Owner

Hey Alex, thanks so much for putting this together! Blackwell/DGX Spark support is exactly the kind of thing I've been wanting to add — the unified memory quirks and the NVRTC SM_121 compatibility issues are real pain points that I haven't had hardware to tackle myself.

A quick heads-up on timing: we're currently deep into the 0.4.0 release cycle on our feat/search-and-rag branch, which has diverged significantly from master (new ASR provider architecture, PyAnnote GPU optimizations, search/RAG features, etc.). That branch is where we're landing before the next release.

Once we cut 0.4.0 and merge to master, integrating this would be a great follow-up. In the meantime, I'd love it if you could keep an eye on how the 0.4.0 changes interact with your Blackwell setup — there are some GPU-side changes that may actually help with your use case, and it'd be great to have you test once we're closer to merging.

Really appreciate the contribution — and the clever workaround for the jiterator/fbank crash is something I hadn't seen documented anywhere else.

@davidamacey
Owner

I've gone through the diff in detail and put together an integration plan for merging this into feat/search-and-rag: https://gist.github.com/davidamacey/d661109796840fc00187d165c52835dd

The short version: the additive pieces (Dockerfile.blackwell, docker-compose.blackwell.yml, the GPU stats fallback logic) are solid and I want to include all of them. There are a few things that'll need adjusting before they can land on the 0.4.0 branch:

  • docker-compose.yml container_name removals — the branch still has those and various scripts depend on them by name, so I'll need to drop that part of the diff
  • utility.py — the branch has a significantly rewritten multi-GPU version of update_gpu_stats() (with utilization %, temperature, per-device tracking). The Blackwell cudaMemGetInfo fallback is great — I'll graft it into the branch's version rather than replace it
  • opentranscribe.sh detection — the file-existence check for docker-compose.blackwell.yml would activate the Blackwell overlay for all NVIDIA users once the file is in the repo. Planning to replace it with an actual nvidia-smi --query-gpu=compute_cap check for SM_12.x
  • huggingface_hub==0.23.5 pin — hard-pinning that globally would conflict with other deps; I'll move it into Dockerfile.blackwell only where it's actually needed
  • Dependency versions — the branch is on whisperx==3.8.1 and a custom pyannote fork with GPU optimisations; Dockerfile.blackwell will need to track those
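
For reference, the "graft the fallback into the existing function" idea from the utility.py item could look roughly like the sketch below. It is not the code from this PR or from the branch: the helper names (nvml_mem_info, cuda_mem_info, gpu_memory) and the returned dict layout are made up for illustration, and it assumes the primary query is NVML via pynvml and that it either raises or reports a zero total on the unified-memory GB10. The fallback goes through torch.cuda.mem_get_info, which wraps cudaMemGetInfo.

```python
from typing import Callable, Optional, Tuple

MemInfo = Tuple[int, int]  # (free_bytes, total_bytes)


def nvml_mem_info(index: int = 0) -> MemInfo:
    """Primary query through NVML (assumed to fail or return zeros on GB10)."""
    import pynvml  # imported lazily so this module loads without pynvml

    pynvml.nvmlInit()
    try:
        handle = pynvml.nvmlDeviceGetHandleByIndex(index)
        info = pynvml.nvmlDeviceGetMemoryInfo(handle)
        return int(info.free), int(info.total)
    finally:
        pynvml.nvmlShutdown()


def cuda_mem_info(index: int = 0) -> MemInfo:
    """Fallback query through the CUDA runtime (cudaMemGetInfo) via PyTorch."""
    import torch  # imported lazily so this module loads without torch

    free, total = torch.cuda.mem_get_info(index)
    return int(free), int(total)


def gpu_memory(
    index: int = 0,
    primary: Callable[[int], MemInfo] = nvml_mem_info,
    fallback: Callable[[int], MemInfo] = cuda_mem_info,
) -> Optional[dict]:
    """Try the primary source first, then the fallback; None if both fail."""
    for source, query in (("nvml", primary), ("cuda", fallback)):
        try:
            free, total = query(index)
        except Exception:
            # Hypothetical policy: any failure simply moves on to the next source.
            continue
        if total > 0:
            return {"source": source, "free": free, "total": total, "used": total - free}
    return None
```

Keeping the two queries as separate callables means the multi-GPU logic on the branch (utilization, temperature, per-device tracking) would stay untouched, and only the memory read would gain a second source.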

Where your help would be most valuable:

  1. whisperx==3.8.1 on DGX Spark — can you verify it installs and runs correctly with the NVIDIA PyTorch base? Same --no-deps trick you're already using should work
  2. Compute cap detection — does nvidia-smi --query-gpu=compute_cap return 12.1 on your GB10? (Needed for the improved auto-detection)
  3. pyannote fork — does pyannote.audio @ git+https://github.com/davidamacey/pyannote-audio.git@gpu-optimizations install cleanly on top of the NVIDIA container? The fork has significant GPU speedups that would benefit your setup
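
To make the compute-cap question concrete, here is a rough sketch of the detection logic, written in Python for illustration rather than as the eventual opentranscribe.sh change. The function names are hypothetical, and it assumes nvidia-smi prints one "major.minor" value per GPU when called with --query-gpu=compute_cap --format=csv,noheader. Whether a GB10 actually reports 12.1 is exactly what still needs confirming on real hardware.

```python
import subprocess
from typing import List, Tuple


def parse_compute_caps(output: str) -> List[Tuple[int, int]]:
    """Parse one 'major.minor' value per line, skipping anything malformed."""
    caps = []
    for line in output.splitlines():
        major, sep, minor = line.strip().partition(".")
        if sep and major.isdigit() and minor.isdigit():
            caps.append((int(major), int(minor)))
    return caps


def needs_blackwell_overlay(output: str) -> bool:
    """True if any reported GPU has compute capability 12.x (SM_12.x)."""
    return any(major == 12 for major, _ in parse_compute_caps(output))


def query_compute_caps() -> str:
    """Run nvidia-smi; return an empty string if it is missing or fails."""
    try:
        result = subprocess.run(
            ["nvidia-smi", "--query-gpu=compute_cap", "--format=csv,noheader"],
            capture_output=True,
            text=True,
            timeout=10,
            check=True,
        )
    except (OSError, subprocess.SubprocessError):
        return ""
    return result.stdout


if __name__ == "__main__":
    # Only select the Blackwell compose overlay when the hardware reports SM_12.x.
    print("blackwell" if needs_blackwell_overlay(query_compute_caps()) else "default")
```

Unlike the file-existence check, this leaves other NVIDIA users on the default compose file even after docker-compose.blackwell.yml is in the repo, because a missing or failing nvidia-smi and any non-12.x capability both fall through to the default.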

Planning to integrate this into 0.4.1 right after the 0.4.0 release. Happy to loop you in early if you want to test against the branch.


2 participants